Add read-only Forge session MCP server #87
| | `GRAFANA_ADMIN_PASSWORD` | Grafana admin password (default: `grafana`) | | ||
| | `LANGFUSE_DOCKER_NETWORK` | External Docker/Podman network for self-hosted Langfuse when using `devtools/grafana/compose.langfuse-network.yml` (default: `langfuse_default`) | | ||
| | `CLICKHOUSE_HOST` | Langfuse ClickHouse host reachable from the Grafana container | | ||
| | `CLICKHOUSE_HOST` | Langfuse ClickHouse host reachable from Grafana | |
We should drop this line change. It only changes the description of `CLICKHOUSE_HOST`, and the change is misleading.
| | Endpoint | Purpose | | ||
| |----------|---------| | ||
| | `/api/v1/observability/tickets/{ticket_key}` | Ticket cost, tokens, latency, workflow steps, and recent observation metadata | |
I think there is value in both the `/api/v1/sessions/{session_id}/summary` and the `/api/v1/observability/tickets` endpoints: each provides useful information and context. However, there is a semantic clash between the `sessions` and `tickets` path segments. Langfuse sessions and Jira tickets are effectively the same thing in Forge, so using different segments for them creates ambiguity. We need to rethink the routing here.
| raise ValueError("trace_id must not be empty") | ||
| client = get_langfuse_client() | ||
| trace = await _call_langfuse("trace.get", client.async_api.trace.get, normalized) |
We should pass `trace_id` as a keyword argument in the call to `_call_langfuse` to be consistent with `get_session_traces()`.
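For illustration, a minimal sketch of the keyword form. It assumes `_call_langfuse` forwards `**kwargs` to the wrapped coroutine; its real signature is not visible in this diff, so the wrapper below and `fake_trace_get` are hypothetical stand-ins rather than the PR's code.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def _call_langfuse(
    operation: str,
    func: Callable[..., Awaitable[Any]],
    **kwargs: Any,
) -> Any:
    """Hypothetical stand-in: await a Langfuse client coroutine and label failures."""
    try:
        return await func(**kwargs)
    except Exception as exc:  # broad on purpose in this sketch
        raise RuntimeError(f"Langfuse {operation} failed: {exc}") from exc


async def fake_trace_get(trace_id: str) -> dict[str, str]:
    """Stand-in for client.async_api.trace.get; not the real client."""
    return {"id": trace_id}


async def get_trace(trace_id: str) -> dict[str, str]:
    normalized = trace_id.strip()
    if not normalized:
        raise ValueError("trace_id must not be empty")
    # Keyword form: the meaning of the argument is visible at the call site.
    return await _call_langfuse("trace.get", fake_trace_get, trace_id=normalized)
```

With the keyword form, the call reads the same way in both tools, and a reordered parameter in the client would fail loudly instead of silently binding the wrong value.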
| return {"error": str(exc), "raw_trace_data_exposed": False} | ||
| @mcp.tool( | ||
| name="get_session_traces", |
Depending on the size of the payload, `get_session_traces` and `get_trace` result in an error on the local MCP client (tested with Claude Code). The error seems to come from either a ~30MB payload ceiling on Google Vertex AI or an overflow of the context window, after which the MCP client drops its connection to the Forge MCP server. We need to verify whether this is the true cause before deciding on a fix. More importantly, we need to rethink how we expose these traces to MCP clients, because the payloads are large even for single traces.
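One possible direction, sketched under assumptions: measure the serialized trace and fall back to a truncated copy when it exceeds a budget. The constants, the function names, and the dict-shaped trace are all hypothetical, and the budget is not a verified limit of any client or backend.

```python
import json
from typing import Any

MAX_FIELD_CHARS = 2_000        # hypothetical per-string cap
MAX_PAYLOAD_BYTES = 1_000_000  # hypothetical budget, far below the suspected ~30MB ceiling


def truncate_large_strings(value: Any, max_chars: int = MAX_FIELD_CHARS) -> Any:
    """Recursively shorten long strings so nested input/output blobs stay small."""
    if isinstance(value, str) and len(value) > max_chars:
        return value[:max_chars] + f"... [truncated {len(value) - max_chars} chars]"
    if isinstance(value, dict):
        return {k: truncate_large_strings(v, max_chars) for k, v in value.items()}
    if isinstance(value, list):
        return [truncate_large_strings(v, max_chars) for v in value]
    return value


def bounded_trace_payload(
    trace: dict[str, Any], max_bytes: int = MAX_PAYLOAD_BYTES
) -> dict[str, Any]:
    """Return the trace unchanged when small enough, else a truncated copy with a flag."""
    size = len(json.dumps(trace, default=str).encode("utf-8"))
    if size <= max_bytes:
        return {"trace": trace, "truncated": False, "original_bytes": size}
    # Note: the truncated copy is smaller, but not guaranteed to fit max_bytes.
    return {
        "trace": truncate_large_strings(trace),
        "truncated": True,
        "original_bytes": size,
    }
```

This only bounds individual string fields; a trace with many observations could still exceed the budget, so pagination or field selection would likely be needed as well.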
| `cwd` to the local Forge repository path: | ||
| ```json | ||
| { |
The documentation was enough here for me to figure out how to define an MCP server using OpenCode. This is how it works with OpenCode:
"forge-session": { "type": "local", "command": ["uv", "run", "forge-session-mcp"], "cwd": "/path/to/forge", "enabled": true }
I haven't used MCP clients other than Claude and OpenCode, so I don't know whether using the `local` type or providing args in the command list is more typical. I'm providing this example for informational purposes only, in case it helps clarify the documentation.
| metadata = _metadata(trace) | ||
| step = str(metadata.get("workflow_step") or "unknown") | ||
| cost = _number(_get_attr(trace, "total_cost", "totalCost")) | ||
| input_tokens = _number(_get_attr(trace, "input_tokens", "inputTokens")) |
input_tokens, output_tokens, and total_tokens are not being set properly. Langfuse doesn't provide these figures in the traces API. However, they can be queried using the observations or metrics API. See get_model_usage() for an example.
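A rough sketch of summing token counts from observations instead of reading them off the trace. It assumes each observation is a dict with a `usage` mapping holding `input`, `output`, and `total` counts; that shape and the function name are assumptions for illustration and should be checked against what `get_model_usage()` actually receives.

```python
from typing import Any, Iterable


def sum_token_usage(observations: Iterable[dict[str, Any]]) -> dict[str, int]:
    """Aggregate token counts across observations (assumed shape, see lead-in)."""
    totals = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
    for obs in observations:
        usage = obs.get("usage") or {}  # tolerate observations without usage
        input_tokens = int(usage.get("input") or 0)
        output_tokens = int(usage.get("output") or 0)
        # Fall back to input + output when no explicit total is reported.
        total_tokens = int(usage.get("total") or (input_tokens + output_tokens))
        totals["input_tokens"] += input_tokens
        totals["output_tokens"] += output_tokens
        totals["total_tokens"] += total_tokens
    return totals
```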
| limit=limit, | ||
| order_by="timestamp.desc", | ||
| fields="core,metrics,io", | ||
| ) |
`fields="core,metrics,io"` causes an error on the Langfuse backend here because we are not passing in a `session_id`. Example error:
"detail": "Langfuse trace.list failed: ...
| | `get_session_traces` | Tool that returns Langfuse traces for one Jira ticket session; full trace data by default | | ||
| | `get_trace` | Tool that returns one full Langfuse trace by trace id | | ||
| | `get_model_usage` | Tool that returns aggregate model calls, cost, tokens, and latency | | ||
| | `get_workflow_funnel` | Tool that returns workflow-step issue, trace, cost, and latency aggregates | |
get_workflow_funnel is currently implemented without filtering by a session. Therefore, it returns a list of steps for all sessions which doesn't sound useful on the surface. Can you clarify the intent of this endpoint?
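If the intent turns out to be a per-ticket funnel, an optional session filter could look roughly like the sketch below. The dict-shaped traces, the `session_id` and `total_cost` keys, and the function name are assumptions for illustration; only the `workflow_step` metadata key and its `unknown` fallback come from the diff.

```python
from collections import defaultdict
from typing import Any, Iterable, Optional


def workflow_funnel(
    traces: Iterable[dict[str, Any]], session_id: Optional[str] = None
) -> dict[str, dict[str, float]]:
    """Aggregate trace count and cost per workflow step, optionally for one session."""
    steps: dict[str, dict[str, float]] = defaultdict(
        lambda: {"trace_count": 0, "total_cost": 0.0}
    )
    for trace in traces:
        # With no session_id, every trace is included (the current behavior).
        if session_id is not None and trace.get("session_id") != session_id:
            continue
        metadata = trace.get("metadata") or {}
        step = str(metadata.get("workflow_step") or "unknown")
        steps[step]["trace_count"] += 1
        steps[step]["total_cost"] += float(trace.get("total_cost") or 0.0)
    return dict(steps)
```

Keeping the filter optional preserves the cross-session aggregate for dashboards while giving MCP clients a ticket-scoped view.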
| The MCP responses include `raw_state_exposed: false` or | ||
| `raw_trace_data_exposed: false` for curated responses. Full trace responses use | ||
| `full_trace_data_exposed: true`. |
Based on our research and findings on large trace payloads, we should reconsider how to expose full traces (`full_trace_data_exposed: true`) to make sure they don't create large payload issues too.
Hi @eshulman2, great work here. This is jam-packed with great features; we just need to refine them. My main concerns involve intent and semantics more than the implementation, namely the routing schema and reorganizing the features and functionality in a more coherent way (see the other comments for those details). In addition to your work, I think it would be useful to add a subsetting mechanism to Finally, do we plan to add authz for externally hosted MCP servers in subsequent PRs?
Summary
- `GET /api/v1/sessions/{ticket_key}/summary`
- `forge-session-mcp` out of the checked-in `mcp-servers.json` so Forge agents do not load session-inspection tools themselves
- `GRAFANA_BASE_URL` config so summaries can include dashboard links

Observability Data Layer
- `/api/v1/observability/*`
- `GET /api/v1/observability/tickets/{ticket_key}/traces`
- `GET /api/v1/observability/traces/{trace_id}`
- `get_session_traces` and `get_trace`

Safety
- `full_trace_data_exposed: true`
- `logs_limit <= 50`

Docs
- `/api/v1/sessions/{ticket_key}/summary` in the API reference
- `docs/reference/session-inspection.md` with HTTP and Claude MCP setup instructions

Tests
- `jq empty mcp-servers.json`
- `UV_CACHE_DIR=/tmp/uv-cache uv run ruff check src/forge/observability/access.py src/forge/api/routes/observability.py src/forge/mcp/session.py src/forge/api/routes/__init__.py src/forge/main.py tests/unit/observability/test_access.py tests/unit/api/routes/test_observability.py tests/unit/mcp/test_session_server.py`
- `UV_CACHE_DIR=/tmp/uv-cache uv run pytest tests/unit/observability/test_access.py tests/unit/api/routes/test_observability.py tests/unit/mcp/test_session_server.py tests/unit/api/routes/test_sessions.py tests/unit/sessions/test_summary.py`